Statistical decision making for optimal budget allocation in crowd labeling
Abstract
It has become increasingly popular to obtain machine learning labels through commercial crowdsourcing services. The crowdsourcing workers, or annotators, are paid for each label they provide, but the task requester usually has only a limited budget. Since data instances differ in labeling difficulty and workers differ in reliability for the labeling task, it is desirable to allocate the budget among the instances and workers so that the overall labeling quality is maximized. In this paper, we formulate the budget allocation problem as a Bayesian Markov decision process (MDP), which conducts learning and decision making simultaneously. The optimal allocation policy can be obtained by using the dynamic programming (DP) recurrence. However, DP quickly becomes computationally intractable as the size of the problem increases. To address this challenge, we propose a computationally efficient approximate policy called the optimistic knowledge gradient. Our method applies to both pull crowdsourcing marketplaces with homogeneous workers and push marketplaces with heterogeneous workers. It can also incorporate contextual information about the instances when such information is available. Experiments on both simulated and real data show that our policy achieves higher labeling quality than other existing policies at the same budget level.
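The abstract names the optimistic knowledge gradient policy without spelling it out, so the following is only a minimal Python sketch under assumptions that are not stated above: binary labels, homogeneous workers whose labels are drawn directly from each instance's unknown positive-label probability, a Beta(1, 1) prior on that probability, and a selection rule that picks the instance whose more favorable one-step outcome yields the largest gain in expected accuracy. All function names (`prob_positive`, `expected_accuracy`, `optimistic_gain`, `allocate`) are hypothetical and chosen for illustration.

```python
import random
from math import comb


def prob_positive(a, b):
    """P(theta >= 0.5) for theta ~ Beta(a, b), with positive integers a and b.

    Uses the identity between the Beta CDF at integer parameters and a
    binomial tail, so no external library is needed.
    """
    n = a + b - 1
    return sum(comb(n, k) for k in range(a)) / 2 ** n


def expected_accuracy(a, b):
    """Posterior probability that the majority-side label is correct."""
    p = prob_positive(a, b)
    return max(p, 1.0 - p)


def optimistic_gain(a, b):
    """Best-case one-step gain in expected accuracy (assumed selection score)."""
    base = expected_accuracy(a, b)
    gain_if_positive = expected_accuracy(a + 1, b) - base
    gain_if_negative = expected_accuracy(a, b + 1) - base
    return max(gain_if_positive, gain_if_negative)


def allocate(true_theta, budget, seed=0):
    """Spend `budget` labels one at a time, simulating the crowd's answers.

    true_theta[i] is the (normally unknown) chance that a worker labels
    instance i as positive; it is used here only to simulate responses.
    """
    rng = random.Random(seed)
    counts = [[1, 1] for _ in true_theta]  # Beta(1, 1) prior per instance
    for _ in range(budget):
        # Pick the instance with the largest optimistic gain (ties: lowest index).
        i = max(range(len(counts)), key=lambda j: optimistic_gain(*counts[j]))
        is_positive = rng.random() < true_theta[i]
        counts[i][0 if is_positive else 1] += 1
    # Infer a positive label when positive evidence is at least as strong.
    inferred = [a >= b for a, b in counts]
    return inferred, counts
```

Calling `allocate([0.9, 0.5, 0.1], budget=30)` returns the inferred labels together with the posterior counts, which show how the simulated budget was spread across the three instances. Taking the maximum over the two possible outcomes, rather than their expectation, is what makes this sketch "optimistic"; whether this matches the paper's exact formulation should be checked against the full text.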
Related articles
Statistical Decision Making for Budget Allocation in Crowdsourcing
In this short paper, we briefly describe some recent progress on statistical decision making for budget allocation in crowdsourcing. We address the budget allocation problem for two important labeling tasks in crowdsourcing: the categorization labeling task and pairwise ranking aggregation. We also show the connections between our work and the “proactive learning” framework proposed by Jaime Ca...
Optimistic Knowledge Gradient Policy for Optimal Budget Allocation in Crowdsourcing
We consider the budget allocation problem in binary/multi-class crowd labeling where each label from the crowd has a certain cost. Since different instances have different ambiguities and different workers have different reliabilities, a fundamental challenge is how to allocate a pre-fixed amount of budget among instance-worker pairs so that the overall accuracy can be maximized. We start with ...
Large-Scale Markov Decision Problems with KL Control Cost and its Application to Crowdsourcing
We study average and total cost Markov decision problems with large state spaces. Since the computational and statistical cost of finding the optimal policy scales with the size of the state space, we focus on searching for near-optimality in a low-dimensional family of policies. In particular, we show that for problems with a Kullback-Leibler divergence cost function, we can recast policy optim...
Bayes-Optimal Effort Allocation in Crowdsourcing: Bounds and Index Policies
We consider effort allocation in crowdsourcing, where we wish to assign labeling tasks to imperfect homogeneous crowd workers to maximize overall accuracy in a continuous-time Bayesian setting, subject to budget and time constraints. The Bayes-optimal policy for this problem is the solution to a partially observable Markov decision process, but the curse of dimensionality renders the computatio...
Optimization of Urban Budget Allocation Based on Spatial Justice Indicators (Case: Mashhad Metropolis)
Abstract: One of the main responsibilities of urban managers is to create justice in the area of fair and equal access of citizens to urban services. By objective realization of spatial justice concept, while providing the citizens with the appropriate services, the ground of reducing urban problems is prepared. Spatial justice is one of the main concepts of sustainable urban development. This ...
Journal title:
- Journal of Machine Learning Research
Volume 16
Publication date: 2015